Enterprise Database Systems
Data Silos, Lakes, and Streams
Data Silos, Lakes, & Streams: Introduction
Data Silos, Lakes, & Streams: Sources, Visualizations, & ETL Operations
Data Silos, Lakes, and Streams: Data Lakes on AWS

Data Silos, Lakes, & Streams: Introduction

Course Number: it_dsdslsdj_01_enus

Lesson Objectives

  • Course Overview
  • recall the characteristics and drawbacks of data silos
  • specify what a data lake enables
  • recognize the advantages of using data lakes to store data
  • describe the architecture of a data lake and identify challenges in its design
  • recall the characteristics of a data warehouse
  • specify the differences between data warehouses and data lakes
  • distinguish between batch and streaming data and recognize the Stream-First Architecture
  • describe how data can be moved from on-premises systems to the AWS cloud platform
  • recognize the technologies used to build data lakes on AWS
  • describe various use cases and architectures of working with data lakes on AWS
  • recall characteristics of data silos, data lakes, and data streams

Overview/Description

This 11-video course discusses the transition of data warehousing to cloud-based solutions using the Amazon Web Services (AWS) cloud platform. You will examine the implications of storing different types of data from different sources within an organization. You will need to be familiar with provisioning and working with cloud resources, basic big data architecture, distributed systems, and using shell commands at a Linux terminal prompt. You will learn that an organization may have data silos, which can prevent other teams in the organization from accessing data. You will learn how to use data lakes, centralized repositories that store data at scale, as a viable solution to the data silos that might exist within an organization. You will learn the difference between a data lake, which stores all kinds of raw data in its native format before the data has been processed, and a data warehouse, which contains data that can be used directly to generate business insights. Finally, this course demonstrates storing data with the Amazon Redshift data warehouse.



Target

Prerequisites: none

Data Silos, Lakes, & Streams: Sources, Visualizations, & ETL Operations

Course Number: it_dsdslsdj_03_enus

Lesson Objectives

  • Course Overview
  • configure a Redshift cluster to store data
  • load data into a Redshift cluster from S3 buckets
  • configure a JDBC connection on Glue to the Redshift cluster
  • crawl data on a Redshift cluster using a Glue crawler
  • crawl data stored in a DynamoDB table
  • configure the Amazon QuickSight business intelligence tool to visualize data
  • build charts and dashboards in QuickSight
  • define a job in Glue to perform ETL operations
  • run ETL scripts using Glue
  • perform ETL operations in Glue to back up data originally stored in Redshift
  • perform ETL operations in Glue to back up data originally stored in DynamoDB
  • recall how to use AWS services for visualizations and ETL

Overview/Description

This course discusses the transition of data warehousing to cloud-based solutions using the Amazon Web Services (AWS) cloud platform. You will explore Amazon Redshift, a fully managed, petabyte-scale data warehouse service that forms part of the larger AWS cloud-computing platform. The 12-video course demonstrates how to create and configure an Amazon Redshift cluster, load data into it from an S3 (Simple Storage Service) bucket, and configure a Glue crawler for the stored data. This course examines how to visualize the data stored in the data lake and how to perform ETL (extract, transform, load) operations on the data using Glue scripts. You will work with DynamoDB, a NoSQL database service that supports key-value and document data structures. You will learn how to use Amazon QuickSight, a high-performance business intelligence service that integrates seamlessly with Glue tables by using the Amazon Athena query service. Finally, you will configure jobs to run ETL operations on data stored in the data lake.
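
As a rough illustration of the extract, transform, and load steps named above, the following is a minimal plain-Python sketch. It does not use the AWS Glue API. The sample data, the `orders` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the target store.

```python
import csv
import io
import sqlite3

# Hypothetical raw input, standing in for a file stored in the data lake.
RAW_CSV = """order_id,customer,amount
1,alice,19.99
2,bob,
3,carol,5.00
"""


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and cast column types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"].title(), float(row["amount"]))
        )
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into the (hypothetical) target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Run all three steps and return (row count, total amount) from the target."""
    load(transform(extract(text)), conn)
    return conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone()
```

Calling `run_etl(RAW_CSV, sqlite3.connect(":memory:"))` loads only the two complete rows, because the record with no amount is dropped in the transform step. A Glue job follows the same three-stage shape, with the sources and targets replaced by services such as S3, Redshift, or DynamoDB.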



Target

Prerequisites: none

Data Silos, Lakes, and Streams: Data Lakes on AWS

Course Number: it_dsdslsdj_02_enus

Lesson Objectives

  • Course Overview
  • configure a custom role with specific permissions on AWS
  • create an S3 bucket and upload files
  • recognize the different operations that can be performed using the AWS Glue console
  • create metadata tables in Glue using the web console
  • perform queries on the Glue data catalog using Athena
  • perform data crawling on S3 to automatically detect schemas
  • execute queries on data in crawled tables
  • perform crawling operations with multiple files in the same path
  • merge data stored in multiple files in the same folder path
  • merge data when files have the exact same schema
  • recall the roles and features of the different AWS services used in the data lake architecture

Overview/Description

This course discusses the transition of data warehousing to cloud-based solutions using the Amazon Web Services (AWS) cloud platform. In 11 videos, the course explores how data lakes store data in a flat structure and tag it, which makes the data easy to search and query. You will learn how to build a data lake on the AWS cloud by storing data in S3 (Simple Storage Service) buckets. You will learn to set up your data lake architecture using AWS Glue, a fully managed ETL (extract, transform, load) service. You will learn to configure and run Glue crawlers, examine how crawlers merge data stored in an S3 folder path, and use data in S3 to generate metadata tables in Glue. You will use Athena, Amazon's interactive query service, as a simple way to analyze data in S3 using standard SQL. Finally, you will examine how to merge the data crawled by the CSV (comma-separated values) crawler into a single table.
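
The idea of merging files that share a schema into a single table can be sketched conceptually in plain Python. This is an illustration only and makes no claim about how Glue crawlers are implemented. The file paths, contents, and the `merge_by_schema` function are hypothetical, and a file's header row is treated as its schema.

```python
import csv
import io
from collections import defaultdict

# Hypothetical files under one folder path; names and contents are made up.
FILES = {
    "sales/2019.csv": "id,region,total\n1,east,100\n2,west,250\n",
    "sales/2020.csv": "id,region,total\n3,east,175\n",
    "sales/notes.csv": "id,comment\n1,late shipment\n",
}


def merge_by_schema(files):
    """Group rows from all files by their header row (the 'schema').

    Files with an identical header end up in the same table; a file with a
    different header gets a table of its own.
    """
    tables = defaultdict(list)
    for path in sorted(files):
        reader = csv.DictReader(io.StringIO(files[path]))
        schema = tuple(reader.fieldnames or ())
        tables[schema].extend(reader)
    return dict(tables)
```

With the sample data, `merge_by_schema(FILES)` yields two tables: one holding the three rows from the two sales files, which have the exact same columns, and a separate one for the notes file, whose columns differ.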



Target

Prerequisites: none
